WIP: Route inline reasoning streams by Hmbown · Pull Request #3446 · Hmbown/CodeWhale

Hmbown · 2026-06-23T04:26:36Z

Summary

add provider-scoped reasoning_stream_style config for separate_field, inline_tags, and none stream behavior
route inline <think>...</think> content into Thinking blocks, including tags split across SSE chunks
document the provider-table key for OpenAI-compatible gateways

Tests

cargo fmt --all -- --check
cargo test -p codewhale-tui --bin codewhale-tui --locked reasoning_style
cargo test -p codewhale-tui --bin codewhale-tui --locked provider_config_loads_reasoning_stream_style
cargo test -p codewhale-tui --bin codewhale-tui --locked stream_decoder_tests
cargo test -p codewhale-tui --bin codewhale-tui --locked stream_reasoning (0 matching tests)
git diff --check

Refs #3222.

Reported with detailed route-shape examples by @buko.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

gemini-code-assist

Code Review

This pull request introduces a new configuration option, reasoning_stream_style, which allows users to customize how streaming reasoning content is handled and separated from the main answer text. It supports separate fields, inline tags (e.g., <think>...</think>), or treating reasoning content as standard text. The changes include updates to client initialization, SSE chunk parsing, configuration loading, documentation, and comprehensive unit tests. The reviewer suggested using fully qualified paths for SystemTime and UNIX_EPOCH in the new configuration test to avoid potential compilation issues if they are not imported.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-23T04:27:44Z

+    let nanos = SystemTime::now()
+        .duration_since(UNIX_EPOCH)
+        .unwrap()
+        .as_nanos();


To prevent potential compilation errors if SystemTime and UNIX_EPOCH are not imported in this test module, it is safer to use their fully qualified paths from std::time.

Suggested change

let nanos = SystemTime::now()

.duration_since(UNIX_EPOCH)

.unwrap()

.as_nanos();

let nanos = std::time::SystemTime::now()

.duration_since(std::time::UNIX_EPOCH)

.unwrap()

.as_nanos();

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

claude · 2026-06-23T09:02:23Z

Claude finished @Hmbown's task in 5m 19s —— View job

PR Review: Route inline reasoning streams

Summary

The routing logic is sound and the state machine for split-chunk tag parsing is correct. A few issues worth addressing before this ships, in priority order:

1. Silent fallback on unrecognized config value — `parse_reasoning_stream_style` (`chat.rs:2165–2173`)

fn parse_reasoning_stream_style(value: &str) -> Option<ReasoningStreamStyle> {
    match value.trim().to_ascii_lowercase().replace('-', "_").as_str() {
        ...
        _ => None,
    }
}

reasoning_stream_style_for_stream calls this and, on None, falls through to the provider/model heuristic. A user who typos "inline_tag" (singular) gets silently ignored — their configured gateway behaviour is then driven by is_reasoning_model_for_stream, which returns SeparateField for reasoning providers and None otherwise. Neither is what they intended. A logging::warn (or even a debug log) when the configured value goes unrecognized would surface this during setup. Fix this →

2. `InlineTags` mode also consumes `reasoning_content` fields (`chat.rs:2742–2751`)

if reasoning_stream_style != ReasoningStreamStyle::None
    && let Some(reasoning) = reasoning_text.as_deref()
{
    push_thinking_delta(...);
}

The condition != None means InlineTags also routes a reasoning_content/reasoning delta to push_thinking_delta. For a gateway configured with inline_tags that also happens to emit reasoning_content (e.g. an OpenRouter passthrough with metadata), both paths activate: reasoning_content starts a Thinking block, then the content string with <think> tags is split by inline_reasoning_segments. The block-transition logic in push_text_delta/push_thinking_delta handles it correctly (no panic, no dropped content), but the downstream consumer sees more thinking-block starts than expected.

If the intent is "InlineTags gateways don't use the separate field", the condition should be == ReasoningStreamStyle::SeparateField. If the intent is "route both if present", that's fine but should be documented. Fix this →

3. Missing test: unclosed `<think>` tag at stream end (`chat.rs:2891–2915`)

The flush path (inline_reasoning_segments("", state, true) at finish_reason) correctly emits remaining pending content as Thinking when inside_think == true. But there is no test for this — a truncated stream ending with an open <think> and no matching </think>. Worth a dedicated #[test] that sends a single chunk with <think> content and no close tag, followed by a finish-reason chunk, asserting the partial reasoning is emitted rather than silently dropped. Fix this →

4. Module-level doc comment names a now-test-only function (`chat.rs:4`)

//! ... SSE parsing (`parse_sse_chunk`) all live here.

parse_sse_chunk is now #[cfg(test)]; the production entry point is parse_sse_chunk_with_reasoning_style. The doc comment should be updated to avoid confusion. Fix this →

False alarm from Gemini

The inline comment suggestion on config/tests.rs:4241 (use std::time::SystemTime::now() fully qualified) is not needed — SystemTime and UNIX_EPOCH are already imported at config/tests.rs:8 via use std::time::{SystemTime, UNIX_EPOCH}.

What's good

trailing_tag_prefix_len — the is_char_boundary guard handles UTF-8 content (e.g. CJK text before a tag) correctly; the O(tag.len()) scan is fine for 7–9 char tags.
push_reasoning_segment empty filter — prevents spurious ContentBlockStart events when a split produces an empty leading/trailing segment.
#[cfg(test)] shim on parse_sse_chunk — clean pattern; the old signature is a thin wrapper around parse_sse_chunk_with_reasoning_style and gated so it doesn't pollute production exports. All callers in client.rs are inside #[cfg(test)] mod tests, so nothing breaks.
Config merge at merge_provider_config — the new field is included correctly with or semantics (override wins, then base).
reasoning_stream_style_for_stream override priority — explicit config beats model/provider heuristic, which is the right default for OpenAI-compatible gateways.
Block transition logic in push_text_delta/push_thinking_delta — correctly stops the opposite block and increments content_index before opening the new one; no index bookkeeping bugs found.

Add provider-scoped reasoning_stream_style handling for chat-completion streams so OpenAI-compatible routes can choose separate_field, inline_tags, or none without a new provider enum variant. Inline <think> blocks are streamed into Thinking content blocks, including tags split across SSE chunks, while none preserves raw visible text and existing separate-field reasoning keeps its default behavior. Refs #3222.

Limit the legacy parse_sse_chunk wrapper to test builds so release and clippy builds use only the style-aware stream parser path. Refs #3222.

Warn when reasoning_stream_style contains an unrecognized configured value, keep inline_tags mode from also consuming separate reasoning_content fields, and cover unclosed <think> stream flushes. Verified with: - cargo fmt --all -- --check - git diff --check - cargo test -p codewhale-tui --bin codewhale-tui --locked reasoning_style - cargo test -p codewhale-tui --bin codewhale-tui --locked provider_config_loads_reasoning_stream_style - cargo test -p codewhale-tui --bin codewhale-tui --locked stream_decoder_tests - cargo test -p codewhale-tui --bin codewhale-tui --locked qianfan_provider_accepts_custom_model_and_base_url - cargo clippy -p codewhale-tui --bin codewhale-tui --locked -- -D warnings -A clippy::uninlined_format_args -A clippy::too_many_arguments -A clippy::unnecessary_map_or -A clippy::assertions_on_constants

claude · 2026-06-23T15:50:28Z

Claude finished @Hmbown's task in 2m 41s —— View job

PR Review: Route inline reasoning streams (updated)

Gather context and understand request
Check what changed in latest commits vs previous review
Read updated diff and key files
Review correctness and logic
Check if previous findings were addressed
Post findings

Follow-up on Previous Findings

The two commits pushed since the initial review (7993d15 and 6900776) fully address all four issues that were flagged. Quick verification:

Finding	Fix
Silent fallback on unrecognized config value	`logging::warn` now emitted in `reasoning_stream_style_for_stream` when `parse_reasoning_stream_style` returns `None`
`InlineTags` also consuming `reasoning_content`	Condition changed from `!= ReasoningStreamStyle::None` → `== ReasoningStreamStyle::SeparateField`; confirmed by new `reasoning_style_inline_tags_ignores_separate_reasoning_field` test
Missing test for unclosed `<think>` at stream end	`reasoning_style_inline_tags_flushes_unclosed_think_at_stream_end` added; flush path confirmed correct
Module doc naming `parse_sse_chunk` (now `#[cfg(test)]`)	Doc comment updated to `parse_sse_chunk_with_reasoning_style`

Fresh Review of Current State

The implementation looks clean. A few observations on the final shape:

State machine (inline_reasoning_segments) — the trailing_tag_prefix_len hold-back loop is correct, including the is_char_boundary guard for multi-byte content. The flush = true path correctly zero-holds and emits everything, so truncated streams don't silently drop content.

Match structure in parse_sse_chunk_with_reasoning_style — the four arms are exhaustive and non-overlapping:

(Some, InlineTags) → segment through inline_reasoning_segments
(Some, _) → plain text (covers SeparateField and None with a content field)
(None, None) → fall through reasoning_text as text (legacy passthrough gateways)
(None, _) → no-op (reasoning field already handled or not applicable)

This correctly preserves the legacy behavior for unknown gateways while adding the two new explicit modes.

parse_sse_chunk test shim — gated #[cfg(test)], creates a fresh InlineReasoningTagState per call (correct for the pre-existing tests that don't test inline tag cross-chunk state). New tests use decode_chunks_with_style which correctly carries state across chunks.

Config — reasoning_stream_style stored as Option<String> on DeepSeekClient and parsed lazily at stream start; merge_provider_config uses or semantics (override wins). Both are correct.

Test coverage — 7 new tests cover: separate field routing, inline tag routing (including split across chunks), unclosed-tag flush, ignoring reasoning_content in InlineTags mode, None passthrough, override priority, and config deserialization.

Minor notes (not blocking):

parse_reasoning_stream_style accepts several undocumented aliases ("inline", "think_tags", "separate", "text", "off", etc.). The doc only mentions the canonical three. These won't cause user confusion, but if the aliases are intentional they're worth a note in CONFIGURATION.md.
The tag match is case-sensitive (<think> only). That's the de-facto standard for these gateways, so it's fine for the current scope.

No new correctness issues found. This is ready to merge.

greptile-apps Bot reviewed Jun 23, 2026

View reviewed changes

gemini-code-assist Bot reviewed Jun 23, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 23, 2026

View reviewed changes

Hmbown marked this pull request as ready for review June 23, 2026 05:56

greptile-apps Bot reviewed Jun 23, 2026

View reviewed changes

Hmbown force-pushed the issue/3222-inline-reasoning-tags branch from 53ef812 to 3636629 Compare June 23, 2026 08:59

Hmbown added 3 commits June 23, 2026 08:45

Fix inline reasoning stream lint

7993d15

Limit the legacy parse_sse_chunk wrapper to test builds so release and clippy builds use only the style-aware stream parser path. Refs #3222.

Hmbown force-pushed the issue/3222-inline-reasoning-tags branch from 3636629 to 6900776 Compare June 23, 2026 15:50

Hmbown merged commit 2a4c67a into main Jun 23, 2026
15 checks passed

Hmbown deleted the issue/3222-inline-reasoning-tags branch June 23, 2026 16:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Route inline reasoning streams#3446

WIP: Route inline reasoning streams#3446
Hmbown merged 3 commits into
mainfrom
issue/3222-inline-reasoning-tags

Hmbown commented Jun 23, 2026

Uh oh!

greptile-apps Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 23, 2026

Uh oh!

greptile-apps Bot left a comment

Uh oh!

greptile-apps Bot left a comment

Uh oh!

claude Bot commented Jun 23, 2026 •

edited

Loading

Uh oh!

claude Bot commented Jun 23, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Hmbown commented Jun 23, 2026

Summary

Tests

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

claude Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Route inline reasoning streams

Summary

1. Silent fallback on unrecognized config value — parse_reasoning_stream_style (chat.rs:2165–2173)

2. InlineTags mode also consumes reasoning_content fields (chat.rs:2742–2751)

3. Missing test: unclosed <think> tag at stream end (chat.rs:2891–2915)

4. Module-level doc comment names a now-test-only function (chat.rs:4)

False alarm from Gemini

What's good

Uh oh!

claude Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Route inline reasoning streams (updated)

Follow-up on Previous Findings

Fresh Review of Current State

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 23, 2026 •

edited

Loading

1. Silent fallback on unrecognized config value — `parse_reasoning_stream_style` (`chat.rs:2165–2173`)

2. `InlineTags` mode also consumes `reasoning_content` fields (`chat.rs:2742–2751`)

3. Missing test: unclosed `<think>` tag at stream end (`chat.rs:2891–2915`)

4. Module-level doc comment names a now-test-only function (`chat.rs:4`)

claude Bot commented Jun 23, 2026 •

edited

Loading